Abstract


The use of the DFP or the BFGS secant updates requires the Hessian at the solution to be positive
definite. The second-order sufficiency conditions ensure positive definiteness only in a subspace
of $R^n$. Conditions are given under which either update can be used safely. A new class of
algorithms is proposed which generates a sequence $\{x_k\}$ converging 2-step q-superlinearly. We
also propose two specific algorithms. The first converges q-superlinearly if the Hessian is positive
definite in $R^n$, and 2-step q-superlinearly if the Hessian is positive definite only in a subspace.
The second generates a sequence converging 1-step q-superlinearly. While the former costs one
extra gradient evaluation, the latter costs one extra gradient evaluation and one extra function
evaluation on the constraints.


Key words: Constrained Optimization, Convergence Theory, Quasi-Newton Methods, Rate of
Convergence, Multiplier Methods.


1.INTRODUCTION


This paper considers the following equality constrained minimization problem:

$$ \text{minimize } f(x) \quad \text{subject to} \quad g(x) = 0 \qquad (1.1) $$

where $f: R^n \to R$ and $g: R^n \to R^m$. Let $g = (g_1, \ldots, g_m)^T$.

We define the augmented Lagrangian $L: R^n \times R^m \times R_+ \to R$ by

$$ L(x,\lambda,c) = f(x) + g(x)^T \lambda + (c/2)\, g(x)^T g(x). $$

For $c$ equal to zero, the augmented Lagrangian reduces to the Lagrangian function, which we
denote

$$ l(x,\lambda) = f(x) + g(x)^T \lambda. $$
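
The two functions just defined can be sketched in a few lines of Python. The toy problem used here ($f(x) = x_1^2 + x_2^2$ with the single constraint $g(x) = x_1 + x_2 - 1$) and all function names are assumptions chosen for illustration; they do not come from the paper.

```python
# Toy problem (assumed for illustration): f(x) = x1^2 + x2^2, g(x) = x1 + x2 - 1.

def f(x):
    return x[0] ** 2 + x[1] ** 2

def g(x):
    # Vector of constraint values; here m = 1.
    return [x[0] + x[1] - 1.0]

def lagrangian(x, lam):
    # l(x, lambda) = f(x) + g(x)^T lambda
    return f(x) + sum(gi * li for gi, li in zip(g(x), lam))

def augmented_lagrangian(x, lam, c):
    # L(x, lambda, c) = f(x) + g(x)^T lambda + (c/2) g(x)^T g(x)
    gx = g(x)
    return lagrangian(x, lam) + 0.5 * c * sum(gi * gi for gi in gx)

x, lam = [1.0, 1.0], [2.0]
value_c0 = augmented_lagrangian(x, lam, 0.0)   # c = 0 recovers l(x, lambda)
value_c4 = augmented_lagrangian(x, lam, 4.0)   # the penalty term adds (c/2)|g(x)|^2
```

At a feasible point the penalty term vanishes, so $L(x,\lambda,c)$ and $l(x,\lambda)$ agree there for every $c$.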

If $x_* \in R^n$ is such that $\nabla g(x_*)$ has full rank, then a necessary condition for $x_*$
to be a solution of (1.1) is that there exists $\lambda_*$ such that

$$ \nabla_x L(x_*,\lambda_*,c) = 0, \qquad g(x_*) = 0, \qquad (1.2) $$

and $\lambda_*$ is unique. Note that the constant $c$ does not affect condition (1.2); therefore
$c$ will have the value zero unless specified otherwise. Let $\{x_k\}$ be a sequence which
approximates $x_*$.

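To simplify the notation let

$$ \nabla g(x_k) = \nabla g_k, \quad \text{and} \quad \nabla g(x_*) = \nabla g_*, $$

$$ A_*^c = \nabla_x^2 L(x_*,\lambda_*,c), $$

$$ A_* = \nabla_x^2 l(x_*,\lambda_*). $$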

Further, let

$$ N(x) = \{\, y \in R^n : \nabla g(x)^T y = 0 \,\} $$

and $N_* = N(x_*)$ and $N_k = N(x_k)$. Throughout the paper we will be working with the following
assumptions:

A1. The functions $f$ and $g$ have second derivatives which are Hölder continuous of order
$p \in (0,1]$ in a neighborhood $D$ of $x_*$.

A2. The solution $x_*$ is a nonsingular point of problem (1.1), i.e.

(1) $\nabla g_*$ has full rank,

(2) $A_*$ is positive definite on $N_*$.

One of the most successful methods for solving problem (1.1) is the Diagonalized Quasi-Newton
Multiplier Method (DQMM) as defined in Tapia [18].


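For $k = 0, 1, 2, \ldots$

$$ \lambda_{k+1} = U(x_k, \lambda_k, B_k) \qquad (1.3.a) $$

$$ B_k s_k = -\nabla_x l(x_k, \lambda_{k+1}) \qquad (1.3.b) $$

$$ y_k = \nabla_x l(x_k + s_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1}) \qquad (1.3.c) $$

$$ B_{k+1} = B(s_k, y_k, B_k) \qquad (1.3.d) $$

$$ x_{k+1} = x_k + s_k \qquad (1.3.e) $$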


where $U$ is a multiplier update formula [18], and $B$ is a secant update formula [4].
Fontecilla-Steihaug-Tapia [10] show that under the assumptions stated above and the nonsingularity
of $A_*$, we get local q-superlinear convergence of the sequence $\{x_k\}$ if in (1.3.a) we use the
Newton multiplier update formula and in (1.3.d) we use the Broyden or the PSB least change secant
updates. If the DFP or the BFGS least change secant updates are used in (1.3.d), positive
definiteness of the Hessian $A_*$ is required.

Our assumptions guarantee that the Hessian $A_*$ is positive definite only in the subspace $N_*$.
Therefore, it is not obvious whether we can keep the same rate of convergence. However, numerical
experiments given by Bertochi-Cavalli-Spedicato [1] and Tapia [18] show that we can safely use the
DFP/BFGS secant updates with the Newton multiplier update when the Hessian $A_*$ is positive
definite only in $N_*$.

Few theoretical or practical algorithms have been given in this area. Powell [16] was the first
to attack this problem, by adapting the BFGS update in such a way that it maintains positive
definiteness throughout the process. Assuming local q-linear convergence of $\{x_k\}$, Powell gives
a sufficient condition to obtain 2-step q-superlinear convergence of $\{x_k\}$, but he does not show
that his modified BFGS satisfies that condition. Instead, he could only get R-superlinear
convergence. Coleman and Conn [5] give a new algorithm based on the DQMM idea, updating the
multipliers with the Projection multiplier update. They have to construct an orthonormal basis
($Z_k$) for the tangent space of the constraints that is used as a projection operator. They need
to project the step and the difference in gradients in order to work with projected DFP/BFGS secant
updates. They prove that the sequence $\{x_k\}$ converges to $x_*$ 2-step q-superlinearly.

Our work differs greatly from theirs. We will prove under what conditions Powell's sufficient
condition for 2-step q-superlinearity is satisfied, and we give a new class of algorithms that are
2-step q-superlinearly convergent without using any projection, or projecting only the step. The
algorithm given by Coleman and Conn can be viewed as a particular case of this class.

In this paper, we use the general convergence theory developed by Fontecilla-Steihaug-Tapia [10]
for the DQMM in order to construct a new class of algorithms, called 2-step algorithms, that
satisfy the characterization of q-superlinear convergence of the sequence $\{x_k\}$.

In Section 2, a new result on the theory of secant updates is given. We consider this result
to be our main contribution to this area. We prove that the DFP/BFGS updates maintain all the
properties found by the Broyden-Dennis-Moré theory when the Hessian is positive definite only in a
subspace of $R^n$, as long as the step remains in the subspace corresponding to the current iterate;
i.e., with $A_*$ positive definite in $N_*$, we just need the step to be in $N_k$. Using this result,
in Section 3 we construct a new class of algorithms that satisfy the two sufficient conditions to
obtain q-superlinear convergence: first, the current step is in $N_k$, and second, the linearized
constraints property (with $g_k = g(x_k)$)

$$ g_k + \nabla g_k^T s_k = 0, $$

which is fundamental for q-superlinearity, holds. In Section 4, we prove that the algorithms given
in Section 3 generate a sequence $\{x_k\}$ that converges to $x_*$ 2-step q-superlinearly. Some of
them are proved to be equivalent to using the DQMM with the Newton multiplier update formula. In
Section 5, we give our main contribution to the area of constrained optimization by finally
constructing an algorithm that takes advantage of the positive definiteness of the Hessian $A_*$.
This algorithm is characterized by the fact that if $A_*$ is positive definite on the whole space
(i.e. $y_k^T s_k > 0$ for all $k$) then it converges q-superlinearly to $x_*$, the reason being that
it is the DQMM with the Newton multiplier update formula, and if $A_*$ is positive definite only in
$N_*$ (i.e. $y_k^T s_k \le 0$ for some $k$) then we switch to a 2-step algorithm that is at least
2-step q-superlinearly convergent. Moreover, switching from one algorithm to the other costs just
one extra gradient evaluation.


Definitions  and  General  Results. 


In the following, two norms will be needed: $\|\cdot\|_F$ will denote the matrix Frobenius norm,
and $|\cdot|$ will denote either the $l_2$ norm or its induced matrix norm, for $R^n$ as well as
for $R^m$.

Definition 1.1: Consider $U: R^n \times R^m \times R^{n \times n} \to 2^{R^m}$. We say that the
multiplier update formula $U$ is x-dominated if for all $B_* \in R^{n \times n}$ there exist an open
neighborhood $N_2$ containing $(x_*, \lambda_*, B_*)$ and a positive constant $\phi$ such that for
all $(x,\lambda,B) \in N_2$ and for all $\lambda_+ \in U(x,\lambda,B)$

$$ |\nabla g_* (\lambda_+ - \lambda_*)| \le \phi\, |x - x_*|. \qquad (1.4) $$

From A1 we know that for a fixed $c \ge 0$ there exists $\gamma > 0$ such that

$$ |\nabla_x^2 L(x,\lambda_*,c) - \nabla_x^2 L(x_*,\lambda_*,c)| \le \gamma\, |x - x_*|^p \qquad (1.5) $$

for all $x \in D$, where $D$ and $p$ are as in A1. The next two lemmas, which will be used
throughout the paper, can be found in Dennis and Schnabel [8].

Lemma 1.2: Let $F: R^n \to R^n$ be continuously differentiable in the open convex set
$D \subset R^n$ containing $x_*$. Assume $F'$ is Hölder continuous of order $p \in (0,1]$ in $D$,
and $F'(x_*)^{-1}$ exists. Then there exist constants $\epsilon > 0$, $\rho > 0$ such that

$$ \frac{1}{\rho}\, |v - u| \le |F(v) - F(u)| \le \rho\, |v - u| \qquad (1.6) $$

for all $u, v \in D$ for which $\max\{|v - x_*|, |u - x_*|\} \le \epsilon$.

Lemma 1.3: Let $F$ satisfy the same conditions as in Lemma 1.2. Then for any $u, v \in D$ there
exists a positive constant $K$ such that

$$ |F(v) - F(u) - F'(x_*)(v - u)| \le K\, \sigma(u,v)^p\, |v - u|, $$

where $\sigma(u,v) = \max\{|v - x_*|, |u - x_*|\}$.

The following result is from Fontecilla-Steihaug-Tapia [10].


Lemma 1.4: Assume A1-A2. For any $c \ge 0$ there exist positive constants $K_1$, $K_2$ and
$\epsilon > 0$ such that for any $\lambda \in R^m$ and $\sigma(x,x_+) \le \epsilon$ we have

$$ |\nabla_x L(x_+,\lambda,c) - \nabla_x L(x,\lambda,c) - A_*^c (x_+ - x)|
   \le [K_1\, \sigma(x,x_+)^p + K_2\, |\lambda - \lambda_*|]\, |x_+ - x| \qquad (1.7) $$

where $\sigma(x,x_+) = \max\{|x - x_*|, |x_+ - x_*|\}$.

Definition 1.5: Let $\{x_k\}$ be any sequence which converges to $x_*$. Given continuous real-valued
functions $g$ and $h$ we write

$$ g(x_k) = o(h(x_k)) \quad \text{as } k \to \infty $$

if

$$ \limsup_{k \to \infty} \frac{g(x_k)}{h(x_k)} = 0. $$


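Throughout the paper we will be using the DFP or the BFGS secant updates given by

$$ B_+^{DFP} = B + \frac{(y - Bs)\,y^T + y\,(y - Bs)^T}{y^T s}
             - \frac{s^T (y - Bs)\; y\, y^T}{(y^T s)^2} \qquad (1.8) $$

and

$$ B_+^{BFGS} = B + \frac{y\, y^T}{y^T s} - \frac{B s\, s^T B}{s^T B s}. \qquad (1.9) $$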

To ease the notation, those secant updates which depend on the step $s$ and the difference of
gradients $y$ will be denoted

$$ B_+ = DFP/BFGS(s,y), $$

where $y = \nabla_x l(x + s, \lambda_+) - \nabla_x l(x, \lambda_+)$.

Let $c$ be such that $A_*^c$ is positive definite.
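
As a concrete reading of the standard DFP and BFGS formulas (1.8) and (1.9), the following Python sketch implements both updates with plain lists and checks the secant equation $B_+ s = y$ on a small example. The helper names and the test data are assumptions made for illustration only.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def mat_add(A, B, alpha=1.0):
    # Returns A + alpha * B for matrices stored as lists of rows.
    return [[a + alpha * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def bfgs_update(B, s, y):
    # B+ = B + y y^T / (y^T s) - (B s)(B s)^T / (s^T B s), for symmetric B.
    Bs = matvec(B, s)
    B_plus = mat_add(B, outer(y, y), 1.0 / dot(y, s))
    return mat_add(B_plus, outer(Bs, Bs), -1.0 / dot(s, Bs))

def dfp_update(B, s, y):
    # B+ = B + [(y - Bs) y^T + y (y - Bs)^T] / (y^T s)
    #        - [s^T (y - Bs)] y y^T / (y^T s)^2
    r = [yi - bi for yi, bi in zip(y, matvec(B, s))]
    ys = dot(y, s)
    B_plus = mat_add(B, mat_add(outer(r, y), outer(y, r)), 1.0 / ys)
    return mat_add(B_plus, outer(y, y), -dot(r, s) / ys ** 2)

B = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 0.0]
y = [2.0, 1.0]          # y^T s = 2 > 0, so both updates are well defined
B_bfgs = bfgs_update(B, s, y)
B_dfp = dfp_update(B, s, y)
```

Both updates require $y^T s > 0$ (and BFGS also $s^T B s > 0$); the sketch does not guard against violations of these conditions.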


2.PROPERTIES  OF  THE  DQMM. 


We will follow the theory developed by Broyden, Dennis and Moré [4] for the DFP (1.8) and
the corresponding theory developed by Stachurski [17] for the BFGS (1.9). Their results can be
summarized in the following lemma.


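Lemma 2.1: Let $M$ be a symmetric nonsingular matrix of order $n$ such that

$$ |My - M^{-1}s| \le \beta\, |M^{-1}s| \qquad (2.1) $$

for some $\beta \in [0, 1/3]$ and vectors $y$ and $s$ in $R^n$ with $s \ne 0$. Then $y^T s > 0$ and
$B_+$ is well defined by the DFP/BFGS(s,y). Moreover, there exist positive constants $\alpha_0$,
$\alpha_1$, and $\alpha_2$ such that for any symmetric matrix $A$ of order $n$

$$ \|B_+ - A\|_M \le \left[ (1 - \alpha_0 \theta^2)^{1/2}
     + \alpha_1 \frac{|My - M^{-1}s|}{|M^{-1}s|} \right] \|B - A\|_M
     + \alpha_2 \frac{|y - As|}{|M^{-1}s|} \qquad (2.2) $$

where $\|Q\|_M = \|MQM\|_F$, $\alpha_0 \in (0,1]$, and

$$ \theta = \frac{|M(B - A)s|}{\|B - A\|_M\, |M^{-1}s|} \ \text{ for } B \ne A, \qquad
   \theta = 0 \ \text{ otherwise.} \qquad (2.3) $$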


For the remainder of the paper we will also assume the following.

A3. The multiplier update is x-dominated.

In order to satisfy (2.1) the Hessian we are approximating must be positive definite. This is not
the case here, as our assumptions indicate: the Hessian $A_*$ is positive definite only in $N_*$.
Hence, we will not be able to satisfy (2.2) unless we find a positive definite matrix $A$ and a
matrix $M$ satisfying (2.1). The following theorem gives the answer to this problem.

For given $x, s \in R^n$ and $\lambda_+ \in R^m$ define

$$ y = \nabla_x l(x + s, \lambda_+) - \nabla_x l(x, \lambda_+). $$

Theorem 2.2: There exists a symmetric and positive definite matrix $A$ such that for x-dominated
multiplier update formulas there exist an open neighborhood $N_1$ containing $(x_*, \lambda_*, A)$
and nonnegative constants $\alpha_0$, $\alpha_1$, and $\alpha_2$ such that for
$(x, \lambda_+, B) \in N_1$, if $s \in N(x)$ then $B_+ = DFP/BFGS(s,y)$ satisfies

$$ \|B_+ - A\|_M \le \left[ (1 - \alpha_0 \theta^2)^{1/2} + \alpha_1 \sigma(x, x+s)^p \right]
     \|B - A\|_M + \alpha_2\, \sigma(x, x+s)^p. \qquad (2.4) $$

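Proof: We will prove that (2.1) is satisfied. Consider

$$ |My - M^{-1}s| \le |M|\, |y - M^{-2}s|. \qquad (2.5) $$

Since $A_*^c$ is a symmetric positive definite matrix there exists a symmetric nonsingular matrix
$M$ such that

$$ A_*^c = M^{-2}. $$

Using the definition of $A_*^c$ we get

$$ |y - M^{-2}s| = |y - A_*^c s| = |y - A_* s - c\, \nabla g_* \nabla g_*^T s|. $$

Since $s \in N(x)$ we get

$$ |y - M^{-2}s| \le |y - A_* s| + c\, |\nabla g_*|\, |\nabla g_* - \nabla g(x)|\, |s|. $$

From A1 there exists $K_1$ such that

$$ |y - M^{-2}s| \le |y - A_* s| + K_1 |x - x_*|\, |s|. \qquad (2.6) $$

Using Lemma 1.4 there exist positive constants $K_2$ and $K_3$ so that

$$ |y - A_* s| \le [K_2\, \sigma(x, x+s)^p + K_3\, |\lambda_+ - \lambda_*|]\, |s|. \qquad (2.7) $$

By A3 we get

$$ |y - A_* s| \le K_4\, \sigma(x, x+s)^p\, |s| \qquad (2.8) $$

for some positive $K_4$. Combining (2.5), (2.6), (2.7), and (2.8) there exists a positive constant
$K_6$ such that

$$ |My - M^{-1}s| \le K_6\, \sigma(x, x+s)^p\, |M^{-1}s| \qquad (2.9) $$

with $K_6 = |M|^2 [K_4 + K_1]$, where we used $|s| \le |M|\, |M^{-1}s|$ and
$\sigma(x, x+s) \le 1$. Using the techniques of Broyden, Dennis and Moré [4] we have the following.
By Lemma 1.2 there is an $\epsilon > 0$ and $\rho > 0$ such that (1.6) holds if
$\sigma(x, x+s) < \epsilon$. Set

$$ N_3 = \{ B \in R^{n \times n} : |(A_*^c)^{-1}|\, |B - A_*^c| < 1/2 \}, $$

$$ N_4 = \{ x \in R^n : |x - x_*| < \epsilon/2 \ \text{and}\
   2\, |(A_*^c)^{-1}|\, (\phi + \rho)\, |x - x_*| < \epsilon/2 \}, $$

and

$$ N_5 = \{ \lambda \in R^m : |\nabla g_* (\lambda - \lambda_*)| \le \phi\, |x - x_*| \}. $$

Then $N = N_4 \times N_5 \times N_3$ is a neighborhood of $(x_*, \lambda_*, A_*^c)$, and if
$(x, \lambda_+, B) \in N$ then by the Banach perturbation lemma the matrix $B$ is nonsingular and

$$ |B^{-1}| \le 2\, |(A_*^c)^{-1}|. $$

Using equation (1.6) and A3 we get

$$ |s| = |B^{-1} \nabla_x l(x, \lambda_+)|
   \le |B^{-1} [\nabla_x l(x, \lambda_+) - \nabla_x l(x_*, \lambda_+)]|
     + |B^{-1} [\nabla_x l(x_*, \lambda_+) - \nabla_x l(x_*, \lambda_*)]| $$

$$ \le \rho\, |B^{-1}|\, |x - x_*| + \phi\, |B^{-1}|\, |x - x_*|
   \le 2\, |(A_*^c)^{-1}|\, (\rho + \phi)\, |x - x_*| < \epsilon/2, $$

and therefore

$$ |x + s - x_*| \le |s| + |x - x_*| < \epsilon. $$

Hence, from (2.9) we have that (2.1) always holds and we obtain

$$ \|B_+ - A_*^c\|_M \le \left[ (1 - \alpha_0 \theta^2)^{1/2} + \alpha_1 \sigma(x, x+s)^p \right]
     \|B - A_*^c\|_M + \alpha_2\, \sigma(x, x+s)^p $$

which implies (2.4) with $A = A_*^c$.

Q.E.D.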


Note that although (2.4) is relative to $A_*^c$, the difference in gradients used (i.e. $y$) does
not depend on $c$. All the theory that we are about to develop rests on this point. Before stating
the following theorem we need to clarify the case $s = 0$. With an x-dominated multiplier update
and assuming convergence, we have that $s = 0$ if and only if $x = x_*$. Therefore, throughout the
paper we will have $s_k \ne 0$.

Now the obvious question is: can we find x-dominated multiplier updates that make the step $s$
lie in $N(x)$? The answer is given by the following result.
Theorem 2.3: Let $s$ be a vector in $R^n$ such that

$$ B s = -\nabla_x l(x, \lambda_+) $$

for some $\lambda_+ \in R^m$. Then $s \in N(x)$ if and only if $\lambda_+$ is given by

$$ \lambda_+ = -(\nabla g^T B^{-1} \nabla g)^{-1} \nabla g^T B^{-1} \nabla f. \qquad (2.10) $$

Moreover, the multiplier update (2.10) is x-dominated.


Proof: Consider $s = -B^{-1} \nabla_x l(x, \lambda_+)$. Then

$$ \nabla g^T s = -\nabla g^T B^{-1} \nabla f - \nabla g^T B^{-1} \nabla g\, \lambda_+. \qquad (2.11) $$

Substituting (2.10) in (2.11) we obtain $\nabla g^T s = 0$; hence $s \in N(x)$. Conversely, setting
(2.11) equal to zero we get (2.10). To prove that (2.10) is x-dominated we use the techniques of
Fontecilla, Steihaug and Tapia [10]. It is straightforward to prove that

lv^x+-x.)|<|pji|^j|z-zj 

with $P_B^{\perp} = B^{-1} \nabla g (\nabla g^T B^{-1} \nabla g)^{-1} \nabla g^T$ and $P_B = I - P_B^{\perp}$. Therefore, (2.10) is x-dominated with

0=inP4-  (2-12) 

Q.E.D. 

We  will  call  (2.10)  the  null-space  multiplier  update. 
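
The equivalence in Theorem 2.3 is easy to check numerically. The sketch below assumes a single constraint ($m = 1$) in $R^2$, so that (2.10) reduces to a ratio of scalars; the matrix, the gradients, and the helper names are made-up illustration data, not values from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve2(B, v):
    # Solve the 2x2 linear system B z = v by Cramer's rule.
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(v[0] * B[1][1] - B[0][1] * v[1]) / det,
            (B[0][0] * v[1] - B[1][0] * v[0]) / det]

def null_space_multiplier(B, grad_f, grad_g):
    # (2.10) with m = 1: lambda+ = -(dg^T B^{-1} df) / (dg^T B^{-1} dg)
    return -dot(grad_g, solve2(B, grad_f)) / dot(grad_g, solve2(B, grad_g))

def quasi_newton_step(B, grad_f, grad_g, lam):
    # Solve B s = -(df + dg * lam), i.e. B s = -grad_x l(x, lam).
    rhs = [-(df + lam * dg) for df, dg in zip(grad_f, grad_g)]
    return solve2(B, rhs)

B = [[2.0, 1.0], [1.0, 3.0]]      # symmetric positive definite (assumed data)
grad_f = [1.0, 2.0]
grad_g = [1.0, 1.0]

lam = null_space_multiplier(B, grad_f, grad_g)
s = quasi_newton_step(B, grad_f, grad_g, lam)
tangency = dot(grad_g, s)          # should vanish: s lies in N(x)

# Any other multiplier gives a step that leaves N(x).
s_other = quasi_newton_step(B, grad_f, grad_g, lam + 0.1)
tangency_other = dot(grad_g, s_other)
```

With these numbers the multiplier is $-4/3$ and the step is $(1/3, -1/3)$, which is orthogonal to $\nabla g = (1, 1)$ up to rounding.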


Define $P(x) = I - \nabla g(x) (\nabla g(x)^T \nabla g(x))^{-1} \nabla g(x)^T$ as the orthogonal
projection onto $N(x)$ and let

$$ P_k = P(x_k) \quad \text{and} \quad P_* = P(x_*). $$


Theorem 2.4: Let the sequences $\{x_k\}$ and $\{\lambda_k\}$ be generated by the DQMM with (1.3.a)
given by (2.10). Then if

$$ \sum_{k=0}^{\infty} |x_k - x_*|^p < +\infty \qquad (2.13) $$

then

$$ \lim_{k \to \infty} \frac{|P_* (B_k - A_*) s_k|}{|s_k|} = 0. \qquad (2.14) $$


Proof: It is a direct consequence of Theorem 2.2. Using the same techniques as Broyden,
Dennis and Moré [4], relation (2.4) together with (2.13) yields

$$ \lim_{k \to \infty} \theta_k = 0 $$

where $\theta_k$ is defined as in (2.3) with $B = B_k$, $A = A_*^c$ and $s = s_k$. Hence,

$$ \lim_{k \to \infty} \frac{|(B_k - A_*^c) s_k|}{|s_k|} = 0; $$

since $|P_*| = 1$ we get

$$ |P_* (B_k - A_*) s_k| = |P_* (B_k - A_*^c) s_k| \le |(B_k - A_*^c) s_k| $$

and since $\lim_{k \to \infty} P_k = P_*$ we obtain (2.14).

Q.E.D.

There are other multiplier updates which are x-dominated. For those multiplier updates which,
by Theorem 2.3, do not produce a step $s$ in $N(x)$, we have the following result.


Theorem 2.5: Let the sequences $\{x_k\}$ and $\{\lambda_k\}$ be generated by the DQMM with (1.3.e)
given by $x_{k+1} = x_k + P_k s_k$. If (2.13) holds then

$$ \lim_{k \to \infty} \frac{|P_* (B_k - A_*) P_k s_k|}{|s_k|} = 0. \qquad (2.15) $$


Proof: Assume $P_k s_k \ne 0$. Let $w_k = P_k s_k$. Since $x_{k+1} = x_k + w_k$ and
$|P_k s_k| \le |s_k|$, Theorem 2.2 gives us the bounded deterioration property (2.4). Assuming
(2.13), (2.4) yields

$$ \lim_{k \to \infty} \frac{|P_* (B_k - A_*) w_k|}{|w_k|} = 0; $$

since $|w_k| \le |s_k|$ we get (2.15).

If $P_k s_k = 0$ then (2.15) holds directly.

Q.E.D.


Note that Powell's sufficient condition for 2-step q-superlinear convergence, i.e. (2.15), is
satisfied. Having conditions (2.14) and (2.15) when using the DFP or the BFGS secant updates,
assuming that the Hessian is positive definite only in $N_*$, is the first step towards
q-superlinear convergence of the sequence $\{x_k\}$ in the DQMM. We also need to satisfy
condition (2.13).


3.PROPOSED  ALGORITHMS 


In spite of the lack of positive definiteness of $A_*$, Section 2 gives us a sufficient condition
to be satisfied by the step used to update the DFP/BFGS in order to get relations (2.4) and
(2.14). Following Fontecilla-Steihaug-Tapia [10], two conditions are necessary to obtain
q-superlinear convergence of the DQMM. They are

$$ \lim_{k \to \infty} \frac{|P_* (B_k - A_*) s_k|}{|s_k|} = 0 \qquad (3.1) $$

$$ \lim_{k \to \infty} \frac{|\nabla g_*\, g_{k+1}|}{|s_k|} = 0. \qquad (3.2) $$

We know that if the step used to update the DFP/BFGS is in $N_k$ then (3.1) holds. On the
other hand, (3.2) holds if our algorithm satisfies the linearized constraints property, i.e.

$$ g_k + \nabla g_k^T s_k = 0. \qquad (3.3) $$

The most natural way to satisfy (3.3) is to take the step in the form

$$ s_k = -\nabla g_k^{+} g_k \qquad (3.4) $$

where $\nabla g_k^{+}$ is a right inverse of $\nabla g_k^T$ that is given by

$$ \nabla g_k^{+} = Q \nabla g_k (\nabla g_k^T Q \nabla g_k)^{-1} \qquad (3.5) $$

for an $n \times n$ matrix $Q$ such that $\nabla g_k^T Q \nabla g_k$ is nonsingular. The most
natural way to have a step in $N_k$ as well as to satisfy (3.3) is to consider steps of the form

$$ s_k = w_k + v_k \qquad (3.6) $$

where $w_k \in N_k$ is used to update the DFP/BFGS, and $v_k$ satisfies (3.4). We obtain the
general form of the proposed algorithms, called 2-step algorithms.


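2-step algorithms.

Given $x_0$, $\lambda_0$, and $B_0$.


For $k = 0, 1, 2, \ldots$

$$ \lambda_{k+1} = U(x_k, \lambda_k, B_k) \qquad (3.7.a) $$

$$ B_k h_k = -\nabla_x l(x_k, \lambda_{k+1}) \qquad (3.7.b) $$

$$ w_k = P_k h_k \qquad (3.7.c) $$

$$ y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1}) \qquad (3.7.d) $$

$$ B_{k+1} = DFP/BFGS(w_k, y_k) \qquad (3.7.e) $$

$$ v_k = -\nabla g_k^{+} g_k \qquad (3.7.f) $$

$$ x_{k+1} = x_k + w_k + v_k \qquad (3.7.g) $$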

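We point out that for the null-space multiplier update formula step (3.7.c) is unnecessary, since
$h_k \in N_k$. If $w_k = 0$ in (3.7.c) we go to (3.7.f). There are two natural choices for the
matrix $Q$ in (3.5), $Q = I$ and $Q = B_k^{-1}$, which give the following

$$ \nabla g_k^{+} = \nabla g_k (\nabla g_k^T \nabla g_k)^{-1} \qquad (3.8.a) $$

$$ \nabla g_{B_k}^{+} = B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} \qquad (3.8.b) $$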
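
A small numerical check of the right inverse (3.5) for the two choices $Q = I$ and $Q = B_k^{-1}$ may help. The sketch assumes one constraint in $R^2$, so the right inverse is a vector; the data and function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve2(B, v):
    # Solve the 2x2 linear system B z = v by Cramer's rule.
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(v[0] * B[1][1] - B[0][1] * v[1]) / det,
            (B[0][0] * v[1] - B[1][0] * v[0]) / det]

def right_inverse(Q_grad_g, grad_g):
    # (3.5) for m = 1: Q dg (dg^T Q dg)^{-1}; Q_grad_g holds the product Q dg.
    denom = dot(grad_g, Q_grad_g)
    return [q / denom for q in Q_grad_g]

B = [[2.0, 1.0], [1.0, 3.0]]     # current secant approximation (assumed data)
grad_g = [2.0, 1.0]              # constraint gradient at x_k (assumed data)
g_val = 0.5                      # constraint value g(x_k) (assumed data)

plus_I = right_inverse(grad_g, grad_g)               # Q = I
plus_B = right_inverse(solve2(B, grad_g), grad_g)    # Q = B^{-1}

# Step (3.4) for each choice; both satisfy g_k + dg_k^T v = 0.
v_I = [-p * g_val for p in plus_I]
v_B = [-p * g_val for p in plus_B]
residual_I = g_val + dot(grad_g, v_I)
residual_B = g_val + dot(grad_g, v_B)
```

The two steps differ ($v_I$ is the minimum-norm correction, $v_B$ is weighted by $B^{-1}$), but each one restores the linearized constraint exactly.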

With  these  two  choices  for  step  (3.7.f)  and  using  the  null-space  multiplier  update  formula  we  get 

the  following  algorithms. 

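ALG1

For $k = 0, 1, 2, \ldots$

$\lambda_{k+1} = -(\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} \nabla g_k^T B_k^{-1} \nabla f_k$

$B_k w_k = -\nabla_x l(x_k, \lambda_{k+1})$

$y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$

$B_{k+1} = DFP/BFGS(w_k, y_k)$

$v_k = -\nabla g_k (\nabla g_k^T \nabla g_k)^{-1} g_k$

$x_{k+1} = x_k + w_k + v_k$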
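
To make the structure of ALG1 concrete, here is a minimal Python sketch of one iteration using the BFGS update. It assumes a toy problem that is not from the paper: minimize $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 = 0$, whose solution is $x_* = (-1, -1)$ with $\lambda_* = 1/2$. All names are illustrative, and no safeguards (for $w_k = 0$ or $y_k^T w_k \le 0$) are included.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def solve2(B, v):
    # Solve the 2x2 linear system B z = v by Cramer's rule.
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(v[0] * B[1][1] - B[0][1] * v[1]) / det,
            (B[0][0] * v[1] - B[1][0] * v[0]) / det]

def bfgs_update(B, s, y):
    # B+ = B + y y^T / (y^T s) - (B s)(B s)^T / (s^T B s)
    Bs, ys = matvec(B, s), dot(y, s)
    sBs = dot(s, Bs)
    return [[B[i][j] + y[i] * y[j] / ys - Bs[i] * Bs[j] / sBs
             for j in range(2)] for i in range(2)]

def grad_f(x):                      # f(x) = x1 + x2 (assumed toy objective)
    return [1.0, 1.0]

def g(x):                           # single constraint (assumed)
    return x[0] ** 2 + x[1] ** 2 - 2.0

def grad_g(x):
    return [2.0 * x[0], 2.0 * x[1]]

def grad_lagrangian(x, lam):
    return [df + lam * dg for df, dg in zip(grad_f(x), grad_g(x))]

def alg1_step(x, B):
    gf, gg = grad_f(x), grad_g(x)
    # Null-space multiplier (2.10), m = 1.
    lam = -dot(gg, solve2(B, gf)) / dot(gg, solve2(B, gg))
    # Tangential step: B w = -grad_x l(x, lam); w lies in N(x).
    w = solve2(B, [-c for c in grad_lagrangian(x, lam)])
    # Gradient difference along w only (one extra gradient evaluation).
    xw = [xi + wi for xi, wi in zip(x, w)]
    y = [a - b for a, b in zip(grad_lagrangian(xw, lam), grad_lagrangian(x, lam))]
    B_next = bfgs_update(B, w, y)
    # Restoration step with Q = I: v = -dg (dg^T dg)^{-1} g.
    v = [-gi * g(x) / dot(gg, gg) for gi in gg]
    x_next = [xi + wi + vi for xi, wi, vi in zip(x, w, v)]
    return x_next, B_next, lam, w, v, y

x0 = [-1.2, -0.7]
B0 = [[1.0, 0.0], [0.0, 1.0]]
x1, B1, lam1, w0, v0, y0 = alg1_step(x0, B0)
```

The secant matrix is updated with the tangential step $w_k$ alone, while the iterate moves by $w_k + v_k$; this separation is what distinguishes the 2-step algorithms from the plain DQMM.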
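ALG2

For $k = 0, 1, 2, \ldots$

$\lambda_{k+1} = -(\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} \nabla g_k^T B_k^{-1} \nabla f_k$

$B_k w_k = -\nabla_x l(x_k, \lambda_{k+1})$

$y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$

$B_{k+1} = DFP/BFGS(w_k, y_k)$

$v_k = -B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k$

$x_{k+1} = x_k + w_k + v_k$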


These two algorithms have the following properties. From Theorem 2.3 the multiplier update
formula is x-dominated. Further, consider $s_k = w_k + v_k$. Since either $P_k s_k = w_k$ for ALG1,
or $P_{B_k} s_k = w_k$ for ALG2, there exists a positive constant $K$ such that

$$ |w_k| \le K\, |s_k|. \qquad (3.11) $$

In either case, from Theorem 2.3, $w_k \in N_k$, and therefore we have relation (2.4); assuming
(2.13), as in Theorem 2.4 we can prove, since (3.11) holds, that

$$ \lim_{k \to \infty} \frac{|P_* (B_k - A_*) w_k|}{|s_k|} = 0. \qquad (3.12) $$

Moreover, since $w_k \in N_k$ the step $s_k$ satisfies (3.3). We thus have all the ingredients to
get q-superlinear convergence.

We have two other multiplier updates that are x-dominated. They are the Projection update

$$ \lambda_{k+1} = -(\nabla g_k^T \nabla g_k)^{-1} \nabla g_k^T \nabla f_k \qquad (3.13) $$

and the Newton update

$$ \lambda_{k+1} = (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} (g_k - \nabla g_k^T B_k^{-1} \nabla f_k). \qquad (3.14) $$

From Theorem 2.3 those multiplier updates will not generate a step $w_k$ in $N_k$, hence the need
to project the step. With this idea we get the following algorithms.
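ALG3

For $k = 0, 1, 2, \ldots$

$\lambda_{k+1} = -(\nabla g_k^T \nabla g_k)^{-1} \nabla g_k^T \nabla f_k$

$B_k h_k = -\nabla_x l(x_k, \lambda_{k+1})$

$w_k = P_{B_k} h_k$

$y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$

$B_{k+1} = DFP/BFGS(w_k, y_k)$

$v_k = -\nabla g_k (\nabla g_k^T \nabla g_k)^{-1} g_k$

$x_{k+1} = x_k + w_k + v_k$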

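ALG4

For $k = 0, 1, 2, \ldots$

$\lambda_{k+1} = -(\nabla g_k^T \nabla g_k)^{-1} \nabla g_k^T \nabla f_k$

$B_k h_k = -\nabla_x l(x_k, \lambda_{k+1})$

$w_k = P_{B_k} h_k$

$y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$

$B_{k+1} = DFP/BFGS(w_k, y_k)$

$v_k = -B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k$

$x_{k+1} = x_k + w_k + v_k$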


Here $P_{B_k} = I - B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} \nabla g_k^T$ is a
projection operator onto the tangent space of the constraints.
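ALG5

For $k = 0, 1, 2, \ldots$

$\lambda_{k+1} = (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} (g_k - \nabla g_k^T B_k^{-1} \nabla f_k)$

$B_k h_k = -\nabla_x l(x_k, \lambda_{k+1})$

$w_k = P_k h_k$

$y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$

$B_{k+1} = DFP/BFGS(w_k, y_k)$

$v_k = -\nabla g_k (\nabla g_k^T \nabla g_k)^{-1} g_k$

$x_{k+1} = x_k + w_k + v_k$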

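ALG6

For $k = 0, 1, 2, \ldots$

$\lambda_{k+1} = (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} (g_k - \nabla g_k^T B_k^{-1} \nabla f_k)$

$B_k h_k = -\nabla_x l(x_k, \lambda_{k+1})$

$w_k = P_k h_k$

$y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$

$B_{k+1} = DFP/BFGS(w_k, y_k)$

$v_k = -B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k$

$x_{k+1} = x_k + w_k + v_k$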


The reason for projecting the step $h_k$ in ALG3 and ALG4 with $P_{B_k}$ instead of $P_k$ is seen
in the next two theorems. For ALG2, ALG4 and ALG5 we obtain the following result.

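Theorem 3.1: Let the step $s_k$ from ALG2, ALG4 or ALG5 satisfy

$$ B_k s_k = -\nabla_x l(x_k, \mu) \qquad (3.15) $$

for some $\mu$ in $R^m$. Then $\mu$ is the Newton multiplier update formula (3.14).

Proof: From ALG2 we have that

$$ B_k w_k = -\nabla f_k + \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} \nabla g_k^T B_k^{-1} \nabla f_k $$

and we also have

$$ B_k v_k = -\nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k. $$

Recall

$$ B_k s_k = B_k (w_k + v_k), $$

so

$$ B_k s_k = -\nabla f_k - \nabla g_k [(\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} (g_k - \nabla g_k^T B_k^{-1} \nabla f_k)]; $$

using (3.14) we then get (3.15). For ALG4 we get

$$ B_k h_k = -\nabla f_k + \nabla g_k (\nabla g_k^T \nabla g_k)^{-1} \nabla g_k^T \nabla f_k, $$

$$ w_k = P_{B_k} h_k, $$

$$ v_k = -B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k. $$

Since $P_{B_k} B_k^{-1} \nabla g_k = 0$ we obtain

$$ w_k = -P_{B_k} B_k^{-1} \nabla f_k, $$

$$ B_k w_k = -\nabla f_k + \nabla g_k [(\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} \nabla g_k^T B_k^{-1} \nabla f_k]. $$

Adding $B_k v_k$ to both sides of this equation we get our desired result. From ALG5 we get

$$ B_k h_k = -\nabla f_k - \nabla g_k [(\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} (g_k - \nabla g_k^T B_k^{-1} \nabla f_k)], $$

$$ w_k = P_k h_k. $$

Doing some algebra on the first equation we get

$$ h_k = -P_{B_k} B_k^{-1} \nabla f_k - B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k. $$

Now projecting with $P_k$, and since $P_k P_{B_k} = P_{B_k}$,

$$ w_k = P_k h_k = -P_{B_k} B_k^{-1} \nabla f_k - P_k B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k $$

so

$$ w_k = -P_{B_k} B_k^{-1} \nabla f_k - B_k^{-1} \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} g_k
       + \nabla g_k (\nabla g_k^T \nabla g_k)^{-1} g_k, $$

$$ w_k = h_k - v_k. $$

Therefore,

$$ h_k = w_k + v_k. $$

Q.E.D.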
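
The identity behind Theorem 3.1 for ALG2 can be checked numerically: with the null-space multiplier for $w_k$ and $Q = B_k^{-1}$ for $v_k$, the full step $s_k = w_k + v_k$ solves $B_k s_k = -\nabla_x l(x_k, \lambda^N)$, where $\lambda^N$ is the Newton multiplier (3.14). The sketch assumes $m = 1$, $n = 2$, and made-up data.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def solve2(B, v):
    # Solve the 2x2 linear system B z = v by Cramer's rule.
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(v[0] * B[1][1] - B[0][1] * v[1]) / det,
            (B[0][0] * v[1] - B[1][0] * v[0]) / det]

B = [[2.0, 1.0], [1.0, 3.0]]     # assumed data
grad_f = [1.0, 2.0]
grad_g = [1.0, 1.0]
g_val = 0.5

Binv_f = solve2(B, grad_f)
Binv_g = solve2(B, grad_g)
denom = dot(grad_g, Binv_g)                  # dg^T B^{-1} dg (a scalar for m = 1)

# ALG2: null-space multiplier, tangential step w, restoration step v (Q = B^{-1}).
lam_ns = -dot(grad_g, Binv_f) / denom
w = [-(a + lam_ns * b) for a, b in zip(Binv_f, Binv_g)]
v = [-b * g_val / denom for b in Binv_g]
s = [wi + vi for wi, vi in zip(w, v)]

# Newton multiplier (3.14) and the right-hand side of (3.15) with mu = lam_newton.
lam_newton = (g_val - dot(grad_g, Binv_f)) / denom
lhs = matvec(B, s)
rhs = [-(df + lam_newton * dg) for df, dg in zip(grad_f, grad_g)]
```

With these numbers $\lambda^N = -1/2$, $s = (0, -1/2)$, and both sides of (3.15) equal $(-1/2, -3/2)$ up to rounding.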

This result is important because it tells us that the DQMM with the Newton update formula and
ALG2/ALG4/ALG5 differ only slightly in the matrices $B_k$, and although they do not generate
the same iterates, asymptotically they will. This fact will be proved in Section 5. For the rest
of the algorithms we get the following.

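Theorem 3.2: Let the step $s_k$ from ALG1, ALG3 or ALG6 satisfy (3.15) for some $\mu$ in $R^m$. Then

$$ s_k = -B_k^{-1} \nabla_x l(x_k, \lambda_{k+1}^N) \pm (\nabla g_{B_k}^{+} - \nabla g_k^{+})\, g_k \qquad (3.16) $$

where $\lambda^N$ is the Newton multiplier update formula (3.14), and $\nabla g_k^{+}$ and
$\nabla g_{B_k}^{+}$ are the right inverses (3.8.a) and (3.8.b).

Proof: For ALG1 we have

$$ B_k w_k = -\nabla f_k + \nabla g_k (\nabla g_k^T B_k^{-1} \nabla g_k)^{-1} \nabla g_k^T B_k^{-1} \nabla f_k $$

and

$$ B_k v_k = -B_k \nabla g_k (\nabla g_k^T \nabla g_k)^{-1} g_k. $$

Then we get

$$ B_k s_k = -\nabla_x l(x_k, \lambda_{k+1}^N) + B_k (\nabla g_{B_k}^{+} - \nabla g_k^{+})\, g_k $$

which implies (3.16). Algorithm ALG3 yields

$$ w_k = P_{B_k} h_k = P_{B_k} [ -B_k^{-1} \nabla f_k + B_k^{-1} \nabla g_k (\nabla g_k^T \nabla g_k)^{-1} \nabla g_k^T \nabla f_k ]; $$

since $P_{B_k} B_k^{-1} \nabla g_k = 0$ we get

$$ w_k = -P_{B_k} B_k^{-1} \nabla f_k $$

which yields (3.16). For ALG6

$$ h_k = -P_{B_k} B_k^{-1} \nabla f_k - \nabla g_{B_k}^{+} g_k $$

hence

$$ w_k = -P_{B_k} B_k^{-1} \nabla f_k + \nabla g_k^{+} g_k - \nabla g_{B_k}^{+} g_k. $$

Therefore after adding $v_k$ we get (3.16).


Q.E.D.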


4.CONVERGENCE  PROPERTIES 


In this section we will prove the convergence properties shared by the 2-step algorithms of
Section 3.

Theorem 4.1: Assume A1 through A2. Assume the sequence $\{x_k\}$ is given by either
ALG2, ALG4 or ALG5. Then for any $r \in (0,1)$ there exist positive constants $\epsilon, \delta$
such that if

$$ |x_0 - x_*| \le \epsilon \quad \text{and} \quad |B_0 - A_*^c| \le \delta $$

the sequence $\{x_k\}$ is well defined and converges to $x_*$ with

$$ |x_{k+1} - x_*| \le r\, |x_k - x_*|. $$

Moreover, the sequences $\{|B_k|\}$ and $\{|B_k^{-1}|\}$ are bounded.

Proof:  By  the  equivalence  of  norms  in  Rnxn  we  have  that  for  any  A  G  Rnxn  there  exist  p  ,  rj  >  0 
such  that 

0  Pll  <  W  <  1 IMII 


Let  r  G  (0,1),  and  choose  t, ,  6  so  small  that  for 

p  >  P3i 

we  have  2  0  r?  S  <  1  , 

r  -  (1  -  2  3  t]  5)  ^  Kl  €'  +  K*  €r<t>  +  2  V  6  *  +  £^’ 

ej 

and  (2  a.  5  +  a») -  <  5. 

1  -  t*  ~ 

Now  select  6r  small  enough  so  that  ||B-Aij|  <  6  whenever  \B  -  Ac^  <  6r.  If  necessary 
further  restrict  5,  so  that  (x^+.X+.B)  G  Nlt  (x^+.B)  G  N2  whenever 
|B-  A^\  <  2  rj  5  ,  and  max  {|x-  zj  ,  |z+-  zj}  <  er. 


Let  j5„  -  A°\  <  b, ,  | z„  -  z,|  <  if,  from  the  Banach  Perturbation  Lemma  [15] 


|(*Sh  \Bt  -  <  0  V  ||B,  -  A'l\  <  t,  0  s  <  2  n  $  6  <  1  ; 

hence  B]1  exists,  and  there  exists  rp  >  0  such  that 

WZ-TTJ7T'  “d  «->in.|. 

where  VBi  —  |(/-  V^Vf'A^V^VfU**"1.  Furthermore, 

\PlBt  -  A,)|  =  jP^B,  -  Al) |  <  |(B,  -  4)|  <2  1,6. 

We  have 

*i  —  *.  -  ^Vil(xo  Ai) 


thus  from  standard  arguments 


*1-*.=  *?(V*l(x.  Ai)  -  Vx  Ita  Al)  -  a.  (x.  -  Xo  )) 

+  B^vJKx.  A.  )  -  Vx  l(x.  Ai)) 

+  {I-B-M,)(z,-z.). 

Now,  taking  norms  and  using  the  triangle  inequality 

l*i  -  x\  <  |#|  |Vjl(x.  Ai)  -  Vx  Ai)  -  A.  (x.  -  Xo  )| 

+  -  A.)(i„  -  z.)  -  V^«(Xi  -  X  ,)| 

Using  the  fact  that  for  the  Newton  multiplier  update  formula  we  have  for  all  k 

Vff/Xi+i  -  X.)  =  VP^VflBiVff -  A,)(xt-  z.) 

+  (**  -  *.) 

where  tt  =  K7 |r*  -  z,j  we  obtain 

l*i  ~*\<  1^1  I V^x,  Ai)  -  Vx  l(x0  Ai)  -  A,  (x.  -  x„  )| 

+  |B;l||(/-  VjUVff^VJ.rVjWS,  -  A ,)(z,  -  z,))| 

+  £,[*,- 

Since  Vjt  =  VgtP,  we  get 
|2i  -  z«|  ^  IB;1!  |Vji(x»  Al)  -  Vx  Ai)  A,  (x,  *o  )|

+  mi\VB'\\PJ(B,-A,){x,-z.))\  +  tlx.-zll 

Hence 

|*i  -  *-|  <  IB;1!  Iv^x,  ,Xi)  -  Vx  l(xo  Ai)  -  A.  (x.  -  x*  )| 

+  \Bl'\\\VBc\\PtB,-A.)\  +  eH\z.-zl 

Therefore, 

1*1  -  *4  <  IB;1!!  K!  eTr  +  Kz  t,  <t>  +  2  TI  8  ip  +  ejjz,  -  x\. 

The  bound  on  B?,  and  the  condition  on  r  yield 

|*i  -  *4  <  r\z,-xl 

Now  by  induction,  assume  for  4=0,1,  .  .  .  ,m-l 

||B*- A5j|  <  2$,  and  |zm  -  z\  <  r  \zk  -  z\. 

From  (1.3. b)  we  have 

||Bi+1  -  A‘4|  -  ||5t  -  A%\  <  2  a,  8  e?  r>*  +  a2  ef  r* 

summing  both  sides  from  4=0  to  m-1  we  obtain 

||Bm  -  Ail|  <  ||fl,  -  A%\  +  (2  <M  +  a2)  <  2  8 

so  |J3m  -  A  <  2  n  &,  and  \P{Bn  -  A,)|  <2  r,  8. 

a 

Using  the  Banach  Perturbation  Lemma  exists,  and  j#4|  <  - — — — -. 

1-2  p  T)  0 

We  complete  the  induction  by  observing  that  for  m  =  0 

l*m+i  -  *4  <  IB-411  #1  +  A2  er  4>  +  2  V  &  if  +  ej. 

The  bound  on  B%,  and  the  condition  on  r  yield 

l**+i  -  *4  ^  r  !*«  -  *4- 

Q 

Notice  that  the  sequence  {|B4|}  is  always  bounded  by  - — — — -  ,  and  for  all  m  we  have 


that 
|£g  <  2  1,  5  +  \A%


Q.E.D.

For the rest of the 2-step algorithms we will prove that the sequence $\{x_k\}$ verifies

$$ |x_{k+1} - x_*| \le r\, |x_{k-1} - x_*| \qquad (4.2) $$

for some $r \in (0,1)$. Note that (4.2) is 2-step q-linear convergence and it implies (2.13).
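
The difference between 1-step and 2-step q-linear convergence can be seen on a synthetic error sequence. The sketch below uses made-up errors $e_k = |x_k - x_*| = 2^{-\lfloor k/2 \rfloor}$, which stall on every other iteration; the helper names are hypothetical.

```python
def one_step_ratios(errors):
    # e_{k+1} / e_k for consecutive iterates
    return [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]

def two_step_ratios(errors):
    # e_{k+1} / e_{k-1}, the quantity bounded by r in the 2-step condition
    return [errors[k + 1] / errors[k - 1] for k in range(1, len(errors) - 1)]

# Synthetic errors: 1, 1, 1/2, 1/2, 1/4, 1/4, ...
errors = [0.5 ** (k // 2) for k in range(10)]

worst_one_step = max(one_step_ratios(errors))
worst_two_step = max(two_step_ratios(errors))
partial_sum = sum(errors)   # stays bounded, consistent with a summability condition
```

Here the 1-step ratio reaches 1 on every other iteration, so the sequence is not q-linearly convergent in the usual sense, while the 2-step ratio is constantly 1/2 and the errors remain summable.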

Theorem 4.2: Assume A1 through A2. Let $\{x_k\}$ be generated by ALG1, ALG3, or ALG6. Then for
any $r \in (0,1)$ there exist positive constants $\epsilon, \delta$ such that if

$$ |x_0 - x_*| \le \epsilon \quad \text{and} \quad |B_0 - A_*^c| \le \delta $$

the sequence $\{x_k\}$ is well defined and converges to $x_*$ with

$$ |x_{k+1} - x_*| \le r\, |x_{k-1} - x_*|. $$

Moreover, the sequences $\{|B_k|\}$ and $\{|B_k^{-1}|\}$ are bounded.

Proof:  Choose  r  g  (0,1).  By  the  equivalence  of  norms  for  any  matrix  A  g  R”x "  there  exist  posi¬ 
tive  constants  p,  rj  such  that 

m\  <  w  <  v\\a\\. 

Choose  e„  6  so  small  that  for 

fi  >  K^n 

we  have 

2 vPS  <  1  (2ax«  +  a2) - —  <  6 

1  -r* 

r  >  /C10{— — +  Kntr<l>  +  20t)6<I>  +  ej  +  K9tr 
1  2  Tf  po 

Now  select  6r  small  enough  so  that  ||Bt-  A5j|  <  6  whenever  \Bk-  Aij  <  Sr  If  necessary  further 
restric  t„  6r  so  that  (x.x+.X+.B)  g  Nk,  (x.X+.B)  g  N2  whenever  |5t  -  <  2 tj6  and  <r{x,x+)  <  er 

Let  \Be  -  <  S„  and  |x,  -  x «{  <  er,  from  the  Banach  Perturbation  Lemma  we  have 

PST1!!*.  -  <  0n\\B,  -  A%\  <  foS  <  2fo6  <  1 


then  B?  exists  and 
Furthermore,


\B?\< 


ri> 

1  -  2 rjnjc  ' 


\PiB,-A')\  <2  nS. 

Using  the  techniques  of  Theorem  2.2  wc  get  that  -  x\  <  t  and  then  with  (1.4.b) 
j|Bj  -  A^\  <  26  and  so 

\PIBX-A.)\<2r>6,  (fifl  <  r~l~,  and  *>|VSJ. 

From  (3.16)  we  get 

|zo-  x\  <  IBI1! vJ(x.  ,X2)  -  Vx  l(x  i,X2)  -  A,  (x.  -  x  j)]| 

+  -  A,){xx  -  x.)  -  Vff^X2  -  X,)|] 

+  Hvtf-  V$£]ft| 

From  (3.3)  and  (4.1) 

1*2  -  *J  <  l^llv^x.  ,X2)  -  Vx  l(x  1,X2)  -  A.  (x.  -  x  i)| 

+  ~  A.)\  +  tJlxj  -  x\ 

+  |v$f  ~  Vffjjlft  -  9c  ~  VS*(*i  -  **)| 

Using  Taylor's  Theorem  on  the  last  term  of  the  right  hand  side 

I  ft  ~  ft  -  i  -  *»)|  <  A8|zj  -  x,|2. 

for  some  positive  K%.  Now 


We  get 


Ivjf  -  VfftJIft  -  ft  -  V^(*i  -  *,)\  <  K9\x,  -  x,|2 
<  etK9\x,  -  x\. 


\xg  -  xj  <  [j^KATief  +  K9t,(j>  +  2fa6<t>  +  eJlzj  -  x\  +  K9i,\z,  -  x\ 
Since $|x_1 - x_*| \le K_{10}\,|x_0 - x_*|$ we get

\*t  -  *J  <  \K^Bil\{Kxt’  +  K^,<t>  +  2PvM  +  ej  +  Kaer]\x,  -  z\ 
The bound on $|B_1^{-1}|$ and the condition on $r$ give

$$|x_2 - x_*| \le r\,|x_0 - x_*|.$$

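Now by way of induction assume for $k = 1, \ldots, m-1$

$$\|B_k - A_*\| < 2\delta \quad\text{and}\quad |x_{k+1} - x_*| \le r\,|x_{k-1} - x_*|.$$

From (1.4.b)

$$\|B_{k+1} - A_*\| - \|B_k - A_*\| \le 2\alpha_1\delta\epsilon_r r^{k/2} + \alpha_2\epsilon_r r^{k/2}$$

so

$$\|B_m - A_*\| \le \|B_0 - A_*\| + (2\alpha_1\delta + \alpha_2)\,\frac{\epsilon_r}{1 - r^{1/2}} \le 2\delta,$$

therefore $B_m^{-1}$ exists. As for $m = 0$ we get

$$|P_m(B_m - A_*)| \le 2\eta\delta \quad\text{and}\quad |B_m^{-1}| \le \frac{\beta}{1 - 2\beta\eta\delta},$$

$$|x_{m+1} - x_*| \le r\,|x_{m-1} - x_*|.$$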

The sequence $\{|B_k^{-1}|\}$ is always bounded by $\beta/(1 - 2\beta\eta\delta)$, and for all $k$ we have that

$$|B_k| \le 2\eta\delta + |A_*|.$$

Q.E.D.

For the rest of the section assume the following.

A4. The iterates $x_k \in \Omega$ and $\lim_{k \to \infty} x_k = x_*$.


Theorem 4.3: Assume A1 thru A4. Let the sequence $\{x_k\}$ be generated by the 2-step algorithms. Then if

$$\lim_{k \to \infty} \frac{|P_k(B_k - A_*)w_k|}{|w_k|} = 0 \qquad (4.3)$$

then the sequence $\{x_k\}$ converges to $x_*$ 2-step q-superlinearly, i.e.

$$\lim_{k \to \infty} \frac{|x_{k+1} - x_*|}{|x_{k-1} - x_*|} = 0. \qquad (4.4)$$

Proof: Following Theorem 4.4 from Fontecilla-Steihaug-Tapia [10] we have that

$$|x_{k+1} - x_*| \le |P_k(B_k - A_*)s_k| + K_{11}\,|\nabla g_*\,g_{k+1}| + o(|s_k|). \qquad (4.5)$$

From the q-linearity and (3.16) there exists a positive constant $K_{12}$ such that

$$|s_k| \le K_{12}\,|x_{k-1} - x_*|.$$

Dividing (4.5) by $|x_{k-1} - x_*|$ and using (4.6) we have

k+i-*4  ^  IWt-A.)**]  .  Ivwml  .  °(M) 

-  12  W  Kli  W  "kP 

Since (3.3) is always satisfied the last two terms of the right hand side are $o(|s_k|)$. Therefore,

l*H*  -  *4  ^  ^  \Pi{Bt-A,)8^  _  odstj) 

S  -^12 - 1 — ; - + 


(4.6) 


|*M  -  **1 

Using  (4.6)  we  get 


N 


W 


(4.7) 


l*w|  <  Ku\Xir-\  -  *.!• 

Since $s_k = w_k + v_k$ and $v_k$ is either (3.8.a) or (3.8.b), we have either $P_k s_k = w_k$ or $P_{B_k} s_k = w_k$, which imply

$$|w_k| \le K_{15}\,|s_k|.$$

Therefore,


l**+i“*4  ^  ^  |P*(S*  -  A,)u>d  ^ 
TPPPT  -  *19  H  +  *171PT 

.  «M) 


(4.8) 


W  * 

Now from (3.3) we have that $g_k = o(|s_{k-1}|)$. Therefore, taking limits on (4.8) and using (4.3) we
get our desired result.

Q.E.D. 

We can now summarize our results.


Theorem 4.4: Assume A1 thru A4. The sequence $\{x_k\}$ generated by the 2-step algorithms converges to $x_*$ 2-step q-superlinearly.

Proof: It is a direct consequence of Theorem 2.4 since for all the 2-step algorithms $w_k \in N_k$ and
(2.13) is always satisfied.


Q.E.D. 
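To make the difference between 1-step and 2-step q-superlinear convergence concrete, the following minimal sketch builds a synthetic error sequence $e_k = |x_k - x_*|$. The sequence is an illustrative assumption (it is not produced by any of the algorithms in this paper): the error stalls on every other iterate and is squared over each pair of iterates. The helper names `ratios` and `synthetic_errors` are hypothetical.

```python
def ratios(errors, step):
    """Return e[k+1] / e[k+1-step] for every index where both terms exist."""
    return [errors[k + 1] / errors[k + 1 - step]
            for k in range(step - 1, len(errors) - 1)]


def synthetic_errors(n_pairs=5, e0=0.5):
    """Errors that stall on every other iterate but are squared over each pair."""
    errors = [e0]
    for _ in range(n_pairs):
        errors.append(errors[-1])        # first step of the pair: no progress
        errors.append(errors[-1] ** 2)   # second step of the pair: error squared
    return errors


errors = synthetic_errors()
one_step = ratios(errors, 1)   # e_{k+1} / e_k
two_step = ratios(errors, 2)   # e_{k+1} / e_{k-1}
```

For this sequence the 1-step ratios return to 1 on every other iterate, so the sequence is not 1-step q-superlinear, while the 2-step ratios decrease toward zero, which is the behaviour described by (4.4).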
5. MODIFIED DIAGONALIZED QUASI-NEWTON ALGORITHMS


In this section we modify the DQMM to construct two new algorithms that can be used when the Hessian is not positive definite in the whole space. The first one is a combination of the DQMM using Newton's update formula and a 2-step algorithm, specifically ALG2; it generates a sequence $\{x_k\}$ that converges q-superlinearly when the Hessian is positive definite and 2-step q-superlinearly when it is positive definite only in a subspace. The second one is constructed using the idea developed by Coleman and Conn [5] and also by Gabay [11], and generates a sequence converging 1-step q-superlinearly. The former costs one extra gradient evaluation over the DQMM, whereas the latter costs one extra gradient evaluation and one extra function evaluation on the constraints.

The  Modified  Diagonalized  Quasi-Newton  Method 

From Theorem 3.1 the step given by the DQMM using Newton's update formula is of the form

$$s_k = w_k + v_k$$

with $w_k \in N_k$ and $v_k$ given by (3.8.b). Noticing that $v_k = o(|s_{k-1}|)$, as was proved in Section 4, we can say that asymptotically both algorithms are equivalent. Moreover, after a few iterations of the DQMM we are in effect using $w_k$ instead of $s_k$, which is the reason why we never had any trouble updating with the DFP/BFGS when the Hessian is positive definite only in $N_*$.
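The role of the tangential step can be checked numerically on a toy case. The sketch below assumes a quadratic Lagrangian, so that $y = A s$ exactly, with a hypothetical Hessian `A` that is indefinite in the whole space but positive definite on the null space of the constraint gradient. All numbers are illustrative assumptions and do not come from the paper.

```python
def matvec(A, x):
    """Product of a small dense matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


A = [[1.0, 0.0], [0.0, -2.0]]   # hypothetical Hessian of the Lagrangian (indefinite)
grad_g = [0.0, 1.0]             # hypothetical constraint gradient
w = [1.0, 0.0]                  # tangential step: grad_g^t w = 0
v = [0.0, 1.0]                  # step along the constraint gradient
s = [wi + vi for wi, vi in zip(w, v)]

tangency = dot(grad_g, w)       # zero, so w lies in the null space
curv_w = dot(matvec(A, w), w)   # y^t w with y = A w
curv_s = dot(matvec(A, s), s)   # y^t s with y = A s
```

Here `curv_w` is positive while `curv_s` is negative, so a DFP/BFGS update built from the pair $(w, y)$ is well defined even though the same update built from $(s, y)$ is not.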

When updating with the DFP/BFGS, the inner product $y_k^t s_k$ can be negative or equal to zero in the first
few iterations. In order to handle this problem we propose the following algorithm.

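M.D.Q.N.

For $k = 0, 1, 2, \ldots$

  $\beta_{k+1} = (\nabla g_k^t B_k^{-1} \nabla g_k)^{-1} g_k$   (5.1.a)

  $\mu_{k+1} = -(\nabla g_k^t B_k^{-1} \nabla g_k)^{-1} \nabla g_k^t B_k^{-1} \nabla f_k$   (5.1.b)

  $\lambda_{k+1} = \beta_{k+1} + \mu_{k+1}$   (5.1.c)

  $B_k w_k = -\nabla_x l(x_k, \mu_{k+1})$   (5.1.d)

  $B_k v_k = -\nabla g_k \beta_{k+1}$   (5.1.e)

  $s_k = w_k + v_k$   (5.1.f)

  $y_k = \nabla_x l(x_k + s_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$   (5.1.g)

  If $y_k^t s_k > 0$ then

    $B_{k+1} = \mathrm{DFP/BFGS}(s_k, y_k)$   (5.1.h)

  else

    $y_k = \nabla_x l(x_k + w_k, \mu_{k+1}) - \nabla_x l(x_k, \mu_{k+1})$   (5.1.i)

    $B_{k+1} = \mathrm{DFP/BFGS}(w_k, y_k)$   (5.1.j)

  end if.

  $x_{k+1} = x_k + s_k$   (5.1.k)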

Notice  that  without  steps  (5.1.i)  and  (5.1.j)  the  MDQMM  is  nothing  but  the  DQMM  with  the 
Newton  multiplier  update  formula.  Furthermore,  the  extra  gradient  evaluation  is  made  only  when 
it  is  strictly  necessary.  We  obtain  the  following  result. 

Theorem 5.1: Let the sequence $\{x_k\}$ be generated by the M.D.Q.N. algorithm. If
$|x_0 - x_*| < \epsilon$ and $|B_0 - A_*| < \delta$

then $\{x_k\}$ converges to $x_*$ q-superlinearly if $A_*$ is positive definite and 2-step q-superlinearly if $A_*$
is positive definite only in $N_*$.

Proof: In Fontecilla-Steihaug-Tapia [10] it was proved that if the Hessian $A_*$ is positive definite
in the whole space then the DQMM with the Newton update formula is q-superlinearly convergent
in $x$. If $A_*$ is positive definite only in $N_*$ then Theorem 4.4 gives the desired result.

Q.E.D. 
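The M.D.Q.N. iteration (5.1.a) to (5.1.k) can be sketched in a few lines for the smallest nontrivial case, $n = 2$ and $m = 1$. Everything below is an assumption made for illustration: the test problem is hypothetical (it is not taken from the paper), only the BFGS variant of the update is used, the 2x2 helper `solve2` stands in for a general linear solver, and the curvature guard in the fallback branch is an addition that the listing above does not contain.

```python
def solve2(B, b):
    """Solve the 2x2 system B z = b by Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(b[0] * B[1][1] - B[0][1] * b[1]) / det,
            (B[0][0] * b[1] - B[1][0] * b[0]) / det]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def bfgs(B, s, y):
    """BFGS secant update of B; the caller must have checked y^t s > 0."""
    Bs = [dot(B[0], s), dot(B[1], s)]
    sBs, ys = dot(s, Bs), dot(y, s)
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(2)] for i in range(2)]


# Hypothetical problem: minimize x1 - 2 (x1 + 1)^2  s.t.  x1^2 + x2^2 - 1 = 0.
# Solution x* = (-1, 0) with multiplier 1/2; the Hessian of the Lagrangian
# there is diag(-3, 1): indefinite, but positive definite on the tangent space.
def grad_f(x):
    return [1.0 - 4.0 * (x[0] + 1.0), 0.0]


def g(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0


def grad_g(x):
    return [2.0 * x[0], 2.0 * x[1]]


def grad_l(x, lam):
    gf, a = grad_f(x), grad_g(x)
    return [gf[0] + lam * a[0], gf[1] + lam * a[1]]


def mdqn_step(x, B):
    """One pass through (5.1.a)-(5.1.k); returns (x_next, B_next, pair_used)."""
    a = grad_g(x)
    denom = dot(a, solve2(B, a))                   # grad g^t B^{-1} grad g
    beta = g(x) / denom                            # (5.1.a)
    mu = -dot(a, solve2(B, grad_f(x))) / denom     # (5.1.b)
    lam = beta + mu                                # (5.1.c)
    w = solve2(B, [-c for c in grad_l(x, mu)])     # (5.1.d)
    v = solve2(B, [-beta * c for c in a])          # (5.1.e)
    s = [w[0] + v[0], w[1] + v[1]]                 # (5.1.f)
    x_next = [x[0] + s[0], x[1] + s[1]]            # (5.1.k)
    y = [p - q for p, q in zip(grad_l(x_next, lam), grad_l(x, lam))]  # (5.1.g)
    if dot(y, s) > 0.0:
        return x_next, bfgs(B, s, y), "s"          # (5.1.h)
    x_w = [x[0] + w[0], x[1] + w[1]]
    y = [p - q for p, q in zip(grad_l(x_w, mu), grad_l(x, mu))]       # (5.1.i)
    if dot(y, w) > 0.0:                            # guard added in this sketch
        B = bfgs(B, w, y)                          # (5.1.j)
    return x_next, B, "w"
```

Calling `mdqn_step([-0.9, 0.3], [[1.0, 0.0], [0.0, 1.0]])` on this toy problem gives $y_k^t s_k < 0$ on the first iteration, so the update falls back to the pair $(w_k, y_k)$ and the extra gradient evaluation of (5.1.i) is spent. Repeating the call with the returned iterate and matrix continues the iteration.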


The  Improved  Diagonalized  Quasi-Newton  Method 

The main difficulty in implementing the MDQN is that we do not know when to switch algorithms.
The Hessian $A_*$ may not be positive definite and we may still have $y_k^t s_k > 0$. We construct an
algorithm that does not have this inconvenience. The idea comes from the Coleman and Conn [5]
algorithm, although they were not able to prove 1-step q-superlinear convergence. At the same time
the same idea was given by Gabay [11], but the proof of q-superlinearity was incomplete. The
algorithm is a modification of the 2-step algorithms ALG1/ALG2.

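I.D.Q.N.

For $k = 0, 1, 2, \ldots$

  $\lambda_{k+1} = -(\nabla g_k^t B_k^{-1} \nabla g_k)^{-1} \nabla g_k^t B_k^{-1} \nabla f_k$   (5.2.a)

  $B_k w_k = -\nabla_x l(x_k, \lambda_{k+1})$   (5.2.b)

  $y_k = \nabla_x l(x_k + w_k, \lambda_{k+1}) - \nabla_x l(x_k, \lambda_{k+1})$   (5.2.c)

  $B_{k+1} = \mathrm{DFP/BFGS}(w_k, y_k)$   (5.2.d)

  $v_k = -\nabla g_k^{+}\, g(x_k + w_k)$   (5.2.e)

  $x_{k+1} = x_k + w_k + v_k$   (5.2.f)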


The only difference with ALG1/ALG2 is step (5.2.e), where we do one extra function
evaluation on the constraints. With this extra function evaluation we are able to prove that the
sequence $\{\bar{x}_{k+1} = x_k + w_k\}$ converges 1-step q-superlinearly.
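The I.D.Q.N. iteration can be sketched in the same way for $n = 2$, $m = 1$. The sketch rests on several assumptions that are not fixed by the text: the multiplier in (5.2.a) is taken to be the formula that makes $w_k$ tangential, $\nabla g_k^{+}$ is taken to be $\nabla g_k (\nabla g_k^t \nabla g_k)^{-1}$, only the BFGS variant is used, the update is skipped when $y_k^t w_k \le 0$, and the test problem and helper names are hypothetical.

```python
def solve2(B, b):
    """Solve the 2x2 system B z = b by Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(b[0] * B[1][1] - B[0][1] * b[1]) / det,
            (B[0][0] * b[1] - B[1][0] * b[0]) / det]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def bfgs(B, s, y):
    """BFGS secant update of B; the caller must have checked y^t s > 0."""
    Bs = [dot(B[0], s), dot(B[1], s)]
    sBs, ys = dot(s, Bs), dot(y, s)
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(2)] for i in range(2)]


# Hypothetical problem: minimize x1 - 2 (x1 + 1)^2  s.t.  x1^2 + x2^2 - 1 = 0,
# with solution x* = (-1, 0) and an indefinite Hessian of the Lagrangian.
def grad_f(x):
    return [1.0 - 4.0 * (x[0] + 1.0), 0.0]


def g(x):
    return x[0] ** 2 + x[1] ** 2 - 1.0


def grad_g(x):
    return [2.0 * x[0], 2.0 * x[1]]


def grad_l(x, lam):
    gf, a = grad_f(x), grad_g(x)
    return [gf[0] + lam * a[0], gf[1] + lam * a[1]]


def idqn_step(x, B):
    """One pass through (5.2.a)-(5.2.f); returns (x_next, B_next)."""
    a = grad_g(x)
    lam = -dot(a, solve2(B, grad_f(x))) / dot(a, solve2(B, a))   # (5.2.a), assumed form
    w = solve2(B, [-c for c in grad_l(x, lam)])                  # (5.2.b)
    x_bar = [x[0] + w[0], x[1] + w[1]]                           # x_k + w_k
    y = [p - q for p, q in zip(grad_l(x_bar, lam), grad_l(x, lam))]  # (5.2.c)
    if dot(y, w) > 0.0:                   # guard added in this sketch
        B = bfgs(B, w, y)                 # (5.2.d)
    scale = g(x_bar) / dot(a, a)          # extra constraint evaluation at x_k + w_k
    v = [-scale * a[0], -scale * a[1]]    # (5.2.e) with the assumed right inverse
    return [x_bar[0] + v[0], x_bar[1] + v[1]], B                 # (5.2.f)
```

Starting from `x = [-0.9, 0.3]` and the identity matrix, two calls to `idqn_step` already move the iterate close to the solution of this toy problem. Note that the matrix is always updated with the tangential pair $(w_k, y_k)$, so no switching test on $y_k^t s_k$ is needed.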

Before stating the theorem we need to clarify certain points. We are assuming A1, A2, and A4;
moreover, we know that the sequence $\{x_k\}$ converges 2-step q-superlinearly. Therefore, since
$\bar{x}_{k+1} = x_k + w_k$ we have

$$|\bar{x}_{k+1} - x_*| \le |x_k - x_*| + |w_k| \qquad (5.3)$$

and since $w_k \to 0$ and $x_k \to x_*$ we have convergence of the sequence $\{\bar{x}_k\}$. We also need to point out that
$w_k \in N_k$, hence we have

$$\lim_{k \to \infty} \frac{|P_k(B_k - A_*)w_k|}{|w_k|} = 0. \qquad (5.4)$$


Let us recall from Fontecilla, Steihaug and Tapia [10] that the operator $H_c$ defined by

$$H_c(x) = P_* \nabla_x l(x, \lambda_*) + c\,\nabla g_*\, g(x)$$

satisfies $H_c(x_*) = 0$ and $H_c'(x_*)$ is nonsingular. We will use the following notation: $\bar{g}_{k+1} = g(\bar{x}_{k+1})$.


Theorem 5.2: Assume A1 thru A4. Then the sequence $\{\bar{x}_k\}$ generated by the IDQN algorithm
converges q-superlinearly to $x_*$, i.e.

$$\lim_{k \to \infty} \frac{|\bar{x}_{k+1} - x_*|}{|\bar{x}_k - x_*|} = 0.$$


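Proof: Let us recall that our system can be written as

$$P_k(B_k w_k + \nabla_x l(x_k, \lambda_{k+1})) = 0. \qquad (5.5)$$

Consider now

$$-P_* \nabla_x l(\bar{x}_{k+1}, \lambda_*) = (P_k - P_*)\nabla_x l(\bar{x}_{k+1}, \lambda_*) - P_k[\nabla_x l(\bar{x}_{k+1}, \lambda_*) - \nabla_x l(x_k, \lambda_*) - A_* w_k] + P_k(B_k - A_*)w_k.$$

Using the same techniques as in Fontecilla, Steihaug and Tapia [10] we get

$$-P_* \nabla_x l(\bar{x}_{k+1}, \lambda_*) - c\,\nabla g_*\,\bar{g}_{k+1} = (P_k - P_*)[\nabla_x l(\bar{x}_{k+1}, \lambda_*) - \nabla_x l(x_*, \lambda_*)] - P_k[\nabla_x l(\bar{x}_{k+1}, \lambda_*) - \nabla_x l(x_k, \lambda_*) - A_* w_k] + P_k(B_k - A_*)w_k - c\,\nabla g_*\,\bar{g}_{k+1}.$$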

Taking norms, using the triangle inequality, and standard arguments on the left hand side there
exist positive constants $K_1$, $K_2$, $K_3$ such that

|"*t+i  -  <  ATi| Pk-PiX*t>H  * 

+  \PABk-A.)wJi  +  KJiw\.  (5.6) 

We have that $|\bar{x}_{k+1} - x_*| \le |w_k| + |x_k - x_*|$. The relation

$$w_k = -B_k^{-1}\nabla_x l(x_k, \lambda_{k+1}) = -B_k^{-1}[\nabla_x l(x_k, \lambda_{k+1}) - \nabla_x l(x_*, \lambda_{k+1})] - B_k^{-1}\nabla g_*(\lambda_{k+1} - \lambda_*)$$

together with the fact that the multiplier update is x-dominated yield

$$|w_k| \le K_4\,|x_k - x_*| \qquad (5.7)$$

for some positive constant $K_4$. Using the fact that $\lim P_k = P_*$ we get with (5.7) in (5.6)
$$|\bar{x}_{k+1} - x_*| \le K_5\,|x_k - x_*|^2 + |P_k(B_k - A_*)w_k| + K_3\,|\bar{g}_{k+1}|$$

for a positive constant $K_5$. We now need to get an estimate on the last term of the right hand
side. Since $\nabla g_k^t w_k = 0$ we can write $\bar{g}_{k+1}$ as

$$\bar{g}_{k+1} = \bar{g}_{k+1} - g_k - \nabla g_k^t w_k + g_k$$

so we get

$$|\bar{g}_{k+1}| \le K_6\,|w_k|^2 + |g_k| \qquad (5.8)$$


but we also have $g_k = g_k - \bar{g}_k - \nabla g_{k-1}^t v_{k-1}$ and $v_{k-1} = x_k - \bar{x}_k$. Therefore,

$$|g_k| \le K_7\,|x_k - \bar{x}_k|^2. \qquad (5.9)$$

Now  (5.8)  and  (5.9)  yield 


|*t+i  -  *J  ^  Kiixk  -  *<|2  +  I PiBk  ~  ^»ki|  +  Ag| xt  -  zj. 


(5.10) 


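Furthermore, $x_k = x_{k-1} + s_{k-1} = \bar{x}_k - \nabla g_{k-1}^{+}\,\bar{g}_k$, hence

$$|x_k - \bar{x}_k| \le K_9\,|\bar{x}_k - x_*| \quad\text{and}\quad |x_k - x_*| \le K_{10}\,|\bar{x}_k - x_*|.$$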


Using those two inequalities in (5.10) we get

$$|\bar{x}_{k+1} - x_*| \le K_{11}\,|\bar{x}_k - x_*|^2 + |P_k(B_k - A_*)w_k|. \qquad (5.11)$$


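Since

$$\frac{1}{|\bar{x}_k - x_*|} \le \frac{K_{10}}{|x_k - x_*|} \le \frac{K_{10}K_4}{|w_k|},$$

dividing (5.11) by $|\bar{x}_k - x_*|$ yields

$$\frac{|\bar{x}_{k+1} - x_*|}{|\bar{x}_k - x_*|} \le K_{11}\,|\bar{x}_k - x_*| + K_{12}\,\frac{|P_k(B_k - A_*)w_k|}{|w_k|}.$$

Therefore the sequence $\{\bar{x}_k\}$ converges q-superlinearly to $x_*$ if the second term on the right hand
side goes to zero, which is true since $w_k \in N_k$.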


Q.E.D. 
6. CONCLUSIONS


We have proposed a modification of the Diagonalized Quasi-Newton Multiplier Method when it is
used with Newton's multiplier update formula and the matrices are updated with the
DFP/BFGS secant updates. In case the Hessian is positive definite it was proved in Fontecilla-Steihaug-Tapia [10] that the method generates a sequence $\{x_k\}$ which converges to $x_*$ 1-step q-superlinearly. Assuming this time that the Hessian is positive definite only in the null space of
$\nabla g(x_*)^t$, we were able to construct a new class of algorithms, called 2-step algorithms, which generate a
sequence $\{x_k\}$ that converges 2-step q-superlinearly to $x_*$. The algorithms cost one extra gradient
evaluation over the standard DQMM. We also proposed two specific algorithms. The first, the Modified diagonalized quasi-Newton method, is a combination of the DQMM with a 2-step algorithm. The
main feature of this algorithm can be seen in the following situation. Suppose we are using the
DQMM and suddenly we are unable to update with the BFGS or the DFP, for instance if $y_k^t s_k \le 0$;
then we shift to a modified DQMM which guarantees that the rate of convergence is at worst 2-step q-superlinear. The price we pay for the shifting is one extra gradient evaluation.

This last modification has the following drawback. It may be that the inner product $y_k^t s_k$ is
strictly positive during the whole process while the Hessian is not positive definite. Hence
the need to find other ways of detecting whether we need to shift to a 2-step algorithm or keep
using the DQMM. In order to overcome this difficulty we also proposed an algorithm, the
Improved diagonalized quasi-Newton method, which guarantees 1-step q-superlinear convergence of a sequence
even when the Hessian is not positive definite. To our knowledge this is the only algorithm
that has these features. It costs one extra gradient evaluation and one extra
function evaluation on the constraints over the DQMM.

We feel that all the proposed algorithms need some testing. At the same time we think that what
we have developed constitutes a good start towards finding globally convergent algorithms.